Bayesian and Approximate Bayesian Modeling of Human Sequential Decision-Making on the Multi-Armed Bandit Problem
Abstract
In this paper we investigate human exploration/exploitation behavior in a sequential decision-making task. Previous studies have suggested that people are suboptimal at scheduling exploration, and that heuristic decision strategies are better predictors of human choices than the optimal model. By incorporating more realistic assumptions about subjects' knowledge and limitations into models of belief updating, we show that optimal Bayesian and approximate Bayesian models of human behavior for the Multi-Armed Bandit (MAB) problem outperform the best heuristic methods on experimental data for 2-arm, 3-arm, and 4-arm bandit problems. Moreover, by disaggregating the fitting performance of decision sequences into several phases, we show that Bayesian modeling is more consistent with exploratory and exploitative human behavior.
Related resources
Bayesian Modeling of Human Sequential Decision-Making on the Multi-Armed Bandit Problem
In this paper we investigate human exploration/exploitation behavior in sequential decision-making tasks. Previous studies have suggested that people are suboptimal at scheduling exploration, and that heuristic decision strategies are better predictors of human choices than the optimal model. By incorporating more realistic assumptions about subjects' knowledge and limitations into models of belief ...
Bayesian Adaptive Management With Learning
Learning and taking action are generally treated as separate choices in adaptive management and decision-making under uncertainty. When management actions produce information, this dichotomous choice is no longer optimal: the value of information and immediate returns should be considered simultaneously. Profiling of trade partners for invasive species risk is an example of such an endogenous lea...
A Bayesian analysis of human decision-making on bandit problems
The bandit problem is a dynamic decision-making task that is simply described, well-suited to controlled laboratory study, and representative of a broad class of real-world problems. In bandit problems, people must choose between a set of alternatives, each with different unknown reward rates, to maximize the total reward they receive over a fixed number of trials. A key feature of the task is ...
Structure Learning in Human Sequential Decision-Making
Studies of sequential decision-making in humans frequently find suboptimal performance relative to an ideal actor that has perfect knowledge of the model of how rewards and events are generated in the environment. Rather than being suboptimal, we argue that the learning problem humans face is more complex, in that it also involves learning the structure of reward generation in the environment. ...